Introduction

As technology becomes more integrated into people’s lives, fewer people are turning to books as a source of entertainment. The number of readers had been declining steadily since the beginning of the twenty-first century, but when the COVID pandemic hit, people finally turned back to books. With everyone stuck at home and little else to do, book sales rose sharply, especially for eBooks, which are more portable and accessible. Among young adult readers, the creation of BookTok brought many people back to the hobby and introduced new readers through its online community.

This report examines Leigh Bardugo’s Grishaverse novels and how their popularity has changed over the years. She is not only one of my personal favorite authors but also one of the most prominent young adult fantasy authors right now. This report hopes to discover:

This dataset is filtered to only include the seven Grishaverse novels (Shadow and Bone, Siege and Storm, Ruin and Rising, Six of Crows, Crooked Kingdom, King of Scars, and Rule of Wolves).
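A filter like the one described above could be sketched with dplyr and stringr. This is only an illustration: the toy `checkouts` data frame and its rows are made up, the column names `Title`, `Creator`, and `Checkouts` are assumed to match the library's export, and matching on a title pattern is one possible approach rather than the exact method used for this report.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the checkout data (rows invented for illustration).
checkouts <- data.frame(
  Title = c(
    "Shadow and Bone / Leigh Bardugo.",
    "Six of Crows: Six of Crows Series, Book 1",
    "Ninth House",
    "The Hobbit"
  ),
  Creator = c("Bardugo, Leigh", "Leigh Bardugo", "Leigh Bardugo", "Tolkien, J. R. R."),
  Checkouts = c(12, 30, 8, 5)
)

# The seven Grishaverse novels named in the report.
grishaverse_titles <- c(
  "Shadow and Bone", "Siege and Storm", "Ruin and Rising",
  "Six of Crows", "Crooked Kingdom", "King of Scars", "Rule of Wolves"
)

# One case-insensitive pattern that matches any of the seven titles.
title_pattern <- regex(paste(grishaverse_titles, collapse = "|"), ignore_case = TRUE)

# Keep rows by Bardugo whose title contains one of the seven names.
grishaverse <- checkouts %>%
  filter(str_detect(Creator, "Bardugo"), str_detect(Title, title_pattern))

print(grishaverse)
```

Matching on a pattern instead of an exact title keeps the differently formatted physical and digital records for the same book, while the title list excludes Bardugo's novels outside the Grishaverse.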

Summary

Write a summary paragraph of findings that includes the 5 values calculated from your summary information R script

These will likely be calculated using your dplyr skills, answering questions such as:

Feel free to calculate and report values that you find relevant.
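As a rough sketch of how such summary values might be computed with dplyr, the example below derives three of them from a small made-up data frame. The `Book`, `CheckoutYear`, and `Checkouts` columns and all the numbers are assumptions for illustration only, not findings from the real data.

```r
library(dplyr)

# Invented checkout counts, used only to demonstrate the calculations.
grishaverse <- data.frame(
  Book = c("Shadow and Bone", "Shadow and Bone", "Six of Crows", "Six of Crows"),
  CheckoutYear = c(2020, 2021, 2020, 2021),
  Checkouts = c(40, 90, 60, 75)
)

# Total checkouts across every book and year.
total_checkouts <- grishaverse %>%
  summarise(total = sum(Checkouts)) %>%
  pull(total)

# The book with the most checkouts overall.
most_checked_out <- grishaverse %>%
  group_by(Book) %>%
  summarise(total = sum(Checkouts), .groups = "drop") %>%
  slice_max(total, n = 1) %>%
  pull(Book)

# The year with the most checkouts across all books.
peak_year <- grishaverse %>%
  group_by(CheckoutYear) %>%
  summarise(total = sum(Checkouts), .groups = "drop") %>%
  slice_max(total, n = 1) %>%
  pull(CheckoutYear)
```

Storing each value in its own named variable makes it easy to reference inline in the summary paragraph.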

The Dataset

The Seattle Public Library collects its monthly checkout data and publishes it for the public. The fields in this data set include title, publication date, checkout month and year, genre, creator, material type, and number of checkouts, among others. The data comes from a variety of sources: Overdrive, hoopla, Freegal, and RBDigital provide data for digital items, while physical item checkout data from April 2005 to September 2016 was sourced from the Legrady artwork data archives. Currently, physical item checkout data is collected from the Horizon ILS. This data set essentially acts as a monthly record of library checkouts, holding hundreds of thousands of checkouts over the years. In a sense, it acts as a marker of history from 2005 until the present.

At first glance, this data set is quite messy. Certain fields are formatted inconsistently, such as Publication Year, which has around seven different potential formats that each mean different things. Many books are listed under different titles depending on the material type. Other fields, such as subjects, have such a wide variety of values that trends are difficult to find when sorting by them. There are gaps in the data and plenty of NA values, and given the size of the data set, there is quite a lot to sort through.
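One way to handle the same book appearing under several titles is to map each raw title to a single canonical name before counting. The sketch below assumes a made-up `raw` data frame, with invented rows and assumed `Title`, `MaterialType`, and `Checkouts` columns, and uses a lowercase match on the seven novel names as the canonical key.

```r
library(dplyr)
library(stringr)

# Invented records showing one book listed under two differently formatted titles.
raw <- data.frame(
  Title = c(
    "Six of Crows / Leigh Bardugo.",
    "Six of Crows: Six of Crows Series, Book 1",
    "Crooked kingdom / Leigh Bardugo."
  ),
  MaterialType = c("BOOK", "EBOOK", "BOOK"),
  Checkouts = c(10, 25, 7)
)

canonical <- c(
  "Shadow and Bone", "Siege and Storm", "Ruin and Rising",
  "Six of Crows", "Crooked Kingdom", "King of Scars", "Rule of Wolves"
)

# Lowercase alternation pattern built from the canonical names.
pattern <- str_to_lower(paste(canonical, collapse = "|"))

# Extract the canonical name from each lowercased title, then total
# checkouts per book across material types.
cleaned <- raw %>%
  mutate(Book = str_extract(str_to_lower(Title), pattern)) %>%
  group_by(Book) %>%
  summarise(total = sum(Checkouts), .groups = "drop")

print(cleaned)
```

Titles that match none of the seven names get `NA` in `Book`, which makes unexpected records easy to spot before they are dropped.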

There are certain variables that would’ve been useful to track alongside what has already been collected. For example, we may be able to draw conclusions about a book’s quality or success based on the number of checkouts, but no other statistic is provided that could strengthen those conclusions, such as an average rating from Goodreads. We also have no way of knowing the demographics of the people checking out items, since the data only tracks information about the items themselves, so we cannot draw conclusions about people or culture in relation to checkouts without making inferences. Ultimately, this data set only covers library checkouts, and its scope is limited to that.

Your Choice

The last chart is up to you. It could be a line plot, scatter plot, histogram, bar plot, stacked bar plot, and more. Here are some requirements to help guide your design:

Here’s an example of how to run an R script inside an RMarkdown file:

```{r, echo = FALSE, code = readLines("chart2_example.R")}
```